Gradient-Based Optimization of Hyperparameters
Author: Yoshua Bengio
Abstract
Many machine learning algorithms can be formulated as the minimization of a training criterion that involves a hyperparameter. This hyperparameter is usually chosen by trial and error with a model selection criterion. In this article we present a methodology to optimize several hyperparameters, based on the computation of the gradient of a model selection criterion with respect to the hyperparameters. In the case of a quadratic training criterion, the gradient of the selection criterion with respect to the hyperparameters is efficiently computed by backpropagating through a Cholesky decomposition. In the more general case, we show that the implicit function theorem can be used to derive a formula for the hyperparameter gradient involving second derivatives of the training criterion.
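To make these two ingredients concrete, here is a minimal sketch, not the paper's own code, for the simplest quadratic case: ridge regression with one regularization hyperparameter. The training solution comes from a Cholesky factorization, and the implicit function theorem gives the derivative of the validation error with respect to the regularization strength. All names (X_tr, lam, and so on) are illustrative.

import numpy as np
from scipy.linalg import cho_factor, cho_solve

def hypergradient_ridge(X_tr, y_tr, X_va, y_va, lam):
    # Training criterion: C(theta, lam) = ||X_tr theta - y_tr||^2 + lam ||theta||^2
    # Selection criterion: E(theta) = ||X_va theta - y_va||^2 / n_va
    d = X_tr.shape[1]
    A = X_tr.T @ X_tr + lam * np.eye(d)
    chol = cho_factor(A)                    # Cholesky factor of the (scaled) Hessian
    theta = cho_solve(chol, X_tr.T @ y_tr)  # minimizer of the training criterion
    g_val = 2.0 * X_va.T @ (X_va @ theta - y_va) / len(y_va)  # dE/dtheta
    # Implicit function theorem: dtheta/dlam = -H^{-1} d2C/(dtheta dlam),
    # with H = 2A and d2C/(dtheta dlam) = 2 theta, i.e. dtheta/dlam = -A^{-1} theta.
    dtheta_dlam = -cho_solve(chol, theta)
    return g_val @ dtheta_dlam              # dE/dlam

Note that with several hyperparameters the same Cholesky factor is reused: each cross-derivative column yields one component of the hyperparameter gradient with only an extra triangular solve.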
Similar papers
Adaptive optimization of hyperparameters in L2-regularised logistic regression
We investigate a gradient-based method for adaptive optimization of hyperparameters in logistic regression models. Adaptive optimization of hyperparameters reduces the computational cost of selecting good hyperparameter values, and allows these optimal values to be pinpointed more precisely, as compared to an exhaustive search of the hyperparameter space.
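The excerpt does not give the mechanism, but a minimal sketch of one standard way to obtain such a gradient, assuming an L2-regularized logistic regression inner problem solved with L-BFGS and the implicit function theorem for the hyperparameter derivative, might look as follows; every name here is invented for the example.

import numpy as np
from scipy.optimize import minimize
from scipy.linalg import cho_factor, cho_solve

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def hypergrad_logreg(X_tr, y_tr, X_va, y_va, lam):
    # Labels y are in {-1, +1}. Inner problem:
    # theta* = argmin sum_i log(1 + exp(-y_i x_i.theta)) + (lam/2) ||theta||^2
    d = X_tr.shape[1]
    def obj(t):
        return np.logaddexp(0, -y_tr * (X_tr @ t)).sum() + 0.5 * lam * t @ t
    def grad(t):
        p = sigmoid(y_tr * (X_tr @ t))
        return -(X_tr.T @ (y_tr * (1 - p))) + lam * t
    theta = minimize(obj, np.zeros(d), jac=grad, method="L-BFGS-B").x
    # Hessian of the training criterion at theta*.
    s = sigmoid(X_tr @ theta)
    H = X_tr.T @ (X_tr * (s * (1 - s))[:, None]) + lam * np.eye(d)
    # Gradient of the (unpenalized) validation loss.
    p_va = sigmoid(y_va * (X_va @ theta))
    g_val = -(X_va.T @ (y_va * (1 - p_va)))
    # IFT: dE/dlam = -g_val^T H^{-1} theta, since d2C/(dtheta dlam) = theta.
    return -cho_solve(cho_factor(H), g_val) @ theta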
Continuous Regularization Hyperparameters
Hyperparameter selection generally relies on running multiple full training trials, with hyperparameter selection based on validation set performance. We propose a gradient-based approach for locally adjusting hyperparameters during training of the model. Hyperparameters are adjusted so as to make the model parameter gradients, and hence updates, more advantageous for the validation cost. We ex...
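A minimal sketch of this per-step adjustment, under assumptions not in the excerpt (a linear least-squares model, one L2 weight-decay hyperparameter, plain gradient descent), differentiates the validation loss through a single parameter update to obtain the hyperparameter update; all names are illustrative.

import numpy as np

def grad_train(theta, X, y, lam):
    # d/dtheta of ||X theta - y||^2 / n + lam ||theta||^2
    return 2 * X.T @ (X @ theta - y) / len(y) + 2 * lam * theta

def grad_val(theta, X, y):
    return 2 * X.T @ (X @ theta - y) / len(y)

def joint_step(theta, lam, X_tr, y_tr, X_va, y_va, lr=1e-2, lr_hyper=1e-3):
    theta_new = theta - lr * grad_train(theta, X_tr, y_tr, lam)
    # Chain rule through the update: d theta_new / d lam = -lr * 2 * theta,
    # so dE/dlam = grad_val(theta_new) . (-lr * 2 * theta).
    dE_dlam = grad_val(theta_new, X_va, y_va) @ (-lr * 2 * theta)
    lam_new = max(lam - lr_hyper * dE_dlam, 0.0)  # keep lam non-negative
    return theta_new, lam_new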
An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models
We consider the task of tuning hyperparameters in SVM models based on minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of k...
Experiments With Scalable Gradient-based Hyperparameter Optimization for Deep Neural Networks
Gradient-based hyperparameter optimization algorithms have the potential to scale to numbers of individual hyperparameters proportional to the number of elementary parameters, unlike other current approaches. Some candidate completions of DrMAD, one such algorithm that updates the hyperparameters after fully training the parameters of the model, are explored, with experiments tuning per-paramet...
Hyperparameter optimization with approximate gradient
Most models in machine learning contain at least one hyperparameter to control for model complexity. Choosing an appropriate set of hyperparameters is both crucial in terms of model accuracy and computationally challenging. In this work we propose an algorithm for the optimization of continuous hyperparameters using inexact gradient information. An advantage of this method is that hyperparamete...
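A minimal sketch of the inexact-gradient idea, assuming a ridge-style quadratic inner problem (not necessarily the setting of the cited work): both the inner solve and the implicit-function-theorem linear system are handled by truncated conjugate gradient, so the resulting hypergradient is only approximate, with accuracy controlled by the iteration cap.

import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def approx_hypergradient(X_tr, y_tr, X_va, y_va, lam, inner_iters=20):
    d = X_tr.shape[1]
    # Regularized Gram matrix as a matrix-free operator: A v = (X'X + lam I) v.
    A = LinearOperator((d, d), matvec=lambda v: X_tr.T @ (X_tr @ v) + lam * v)
    # Inexact inner solve: theta only approximately minimizes the training criterion.
    theta, _ = cg(A, X_tr.T @ y_tr, maxiter=inner_iters)
    g_val = 2.0 * X_va.T @ (X_va @ theta - y_va) / len(y_va)  # validation gradient
    # Inexact IFT system: q ~ A^{-1} g_val; the exact hypergradient is -q . theta.
    q, _ = cg(A, g_val, maxiter=inner_iters)
    return -q @ theta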
Journal: Neural Computation
Volume: 12, Issue: 8
Pages: -
Published: 2000